Fast Clustering Analysis of Inhomogenous Megapixel CMB Maps 
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ABSTRACT 

Szapudi et al. (2001) introduced the method of estimating angular power spectrum of the 
CMB sky via heuristicaUy weighted correlation functions. Part of the new technique is that all 
(co)variances are evaluated by massive Monte Carlo simulations, therefore a fast way to measure 
correlation functions in a high resolution map is essential. This letter presents a new algorithm to 
calculate pixel space correlation functions via fast spherical harmonics transforms. Our present 
implementation of the idea extracts correlations from a MAP-like CMB map (HEALPix resolution 
of 512, i.e. ~ 3 X 10^ pixels) in about 5 minutes on a 500MHz computer, including Ci inversion; 
the analysis of one Planck-like map takes less then one hour. We use heuristic window and 
noise weighting in pixel space, and include the possibility of additional signal weighting as well, 
either in i or pixel space. We apply the new code to an ensemble of MAP simulations, to test 
the response of our method to the inhomogenous sky coverage/noise of MAP. We show that the 
resulting C^'s are very close to the theoretical expectations. The HEALPix based implementation 
of the method, SpICE (Spatially Inhomogenous Correlation Estimator) will be available to the 

public from the authors. 
Introduction 



With the successful launch of MAP and the ad- 
vancing Planck schedule, megapixel Cosmic Mi- 
crowave Background (CMB) maps are just around 
the corner. To fully realize their potential of con- 
straining cosmological parameters within a few 
percent (e.g., Spergel 1994; Knox 1995; Hinshaw, 
Bennett, & Kogut 1995; Jungman et al. 1996; 
Zaldarriaga, Spergel & Seljak 1997; Bond, Efs- 
tathiou, & Tegmark 1997), data analysis methods 
have to face hitherto unprecedented challenges. 
The standard maximum likelihood developed for 
COBE (e.g., Gorski 1994; Gorski et al. 1994, 
1996; Bond 1995; Tegmark & Bunn 1995; Tegmark 
1996; Bunn & White 1996; Bond, Jaffe, & Knox 
1998, 2000) is already pushing supercomputers for 
balloon-borne and ground based experiments with 
TV ~ 10'"^ pixels (de Bernardis et al. 2000; Net- 
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terfield et al. 2001; Hanany et al. 2000; Jaffe et 
al. 2000; Martin et al. 1996; Miller et al. 1999; 
Peterson et al. 2000), and it is clearly inadequate 
for analysing megapixel maps on any future super- 
computer. This letter addresses an important step 
in the full data processing pipeline from compo- 
nent separation/mapmaking to cosmological pa- 
rameter estimation: the fast estimation of the an- 
gular power spectrum, Ci from megapixel map 
with realistic geometry (sky cut, cut out holes), 
and inhomogeneous coverage and noise. 

A recent surge of activity motivated by the ob- 
vious need for fast alternatives to the standard 
maximum likelihood estimation produced an array 
of promising techniques. Several of them use sym- 
metries of a particular experiment to gain speed. 
An experiment specific technique for MAP was 
developed by Oh, Spergel and Hinshaw (1998) 
making use of the planned approximate azimuthal 
symmetry of the coverage/noise. Similar devel- 
opments are under way for Planck (e.g., Wan- 
delt 2000) possibly exploiting symmetries in its 
scanning strategy. Other techniques give up op- 
timality in favor of speed. Correlation functions 
with simple weights were advocated by Szapudi et 
al. (2001, hereafter SPPSB), while an analogous 



method based on direct spherical Fourier trans- 
form was developed by Hivon et al. (2001). In the 
standard CMB formalism (e.g. Bond et al. 1998) 
both of these can be thought of as quadratic es- 
timators with suboptimal weights, yet, they tend 
to produce results very close to optimal. 

This letter presents significant improvements of 
the correlation function method. We present a fast 
algorithm for extracting the pixel space estimator 
using fast spherical harmonics transform, intro- 
duce noise weighting, test the use of Monte Carlo 
realizations to calculate noise correlation functions 
in the estimator, and finally illustrate the possibil- 
ity of additional signal weighting. 



2. Summary of the Method 

Our latest recipe to extract C^'s from a large 
CMB map is an improved version of SPPSB: ex- 
tract the two-point correlation function with an 
unbiased weighted estimator sampled at the roots 
of Legendre polynomials, then integrate with a 
Gauss-Legendre quadrature to obtain the Cis. 
Next we review the correlation function estimator. 

Let us denote the temperature fluctuations at a 
sky vector n, a unit vector pointing to a pixel on 
the sky, with T{n). In isotropic universes the two- 
point correlation function is a function only of the 
angle between the two vectors and can expanded 
into a Legendre series. 
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Fig. 1. — CPU requirements of selected CMB 
analysis methods are plotted: MADCAP (TV^, 
filled triangles, BorriU 1999), SPPSB [N^, open 
triangles), and the present method (~ N^-^^ , filled 
squares). The experiment specific technique of 
OSH for MAP is plotted with a cross, while an 
open square illustrates how Moore's law will shift 
the CPU for our method by 2007. All the squares 
are actual measured values on a 500Mhz CPU, 
while the triangles are based on extrapolation of 
scaling. The two leftmost points are relevant for 
MAP and Planck. The righthand y-axis displays 
CPU time in years, instead of seconds for clarity. 



a^ = {T{n,)T{n^)) = ^ i±i^C,F,(cos0), 

(1) 
where con 9 = ni.n2 is the dot product of the 
two unit vectors, Pi{x) is the i-ih Legendre poly- 
nomial, and the Ci coefficients realize the angu- 
lar power spectrum of fluctuations. If the CMB 
anisotropy is Gaussian, which is expected to be 
an excellent approximation, the correlation func- 
tion, or the C^'s, yields full statistical description. 
In reality each pixel value. A;, contains con- 
tributions from the CMB and noise; the latter is 
also assumed to be Gaussian with a correlation 
matrix (the noise matrix) iVy, determined during 
map-making. The full pixel-pixel correlation ma- 



trix is Ci. 



?. 



A',,-, matrix, as a sum of noise 



correlation functions measured in a set of noise re- 
alizations. Therefore we use the modified version 
of the estimator by SPPSB 
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where n^ is one of M realizations of the noise for 
pixel i. Our estimator can be extracted efficiently 
and accurately for a general noise matrix, as long 
as a fast noise generator exists. The weights /y 
merit further discussion. For fij ~ {S + N)~'^ the 
above corresponds to an optimal quadratic estima- 
tor (where S denotes the matrix corresponding to 
^ij). Unfortunately, the cost to calculate the op- 
timal weights would be prohibitive as it is for any 
other version of the optimal quadratic estimator; 
speed is gained by using heuristic weights instead 



of the optimal ones. SPPSB used window weigh- 
ing, fij = unless the pair of pixels belong to a 
particular bin in cos9, and J^ij fv = 1- The anal- 
ysis of the next section uses more general noise 
weighting. Other possibilities exist and are worth 
to explore, however, they can only affect slightly 
the error bars of our unbiased estimator. The next 
section demonstrates that heuristic noise weight- 
ing produces results close to the theoretical expec- 
tations. Heuristic window weighting is natural in 
pixel space where the window is diagonal, and it 
amounts to edge effect correction (Szapudi & Sza- 
lay 1998). Heuristic noise weighting takes this idea 
one step further, and it is again natural in pixel 
space as long as the noise is localized as it is for 
MAP. It is worth reemphasizing that azimuthal 
symmetry is not required. 




Fig. 2. — The figure displays the average correla- 
tion function in a simulated MAP survey (black) 
and one realization (red). The inserted panel 
shows the average noise correlation function (line) 
with one realization (diamonds). 



3. Fast Estimation of the Correlation 
Ftinction at Legendre Roots 

The novel correlation function estimator is 
based on fast spherical harmonics decomposition; 
our aim is to make use of their speed, based on 
the FFT's of isolatitude pixel annuli. Practical 
implementation of our estimator is based on the 
fast transforms available in the HEALPix ^ pack- 
age. Let us define two unit vectors with angle 9 



apart, e.g., ni = (0,0), and n2 — (0,0) in spheri- 
cal coordinates. In the continuum limit, the raw 
weighted pairwise estimator is an average over the 
rotation group 

Dfs{cos9) = ^ J dgfS{R{g)ni)fS{R{g)n2), 

(3) 
where the integration is over S0(3), Stt^ is the full 
volume of the Lie group, and R(g) is the trans- 
formation corresponding to the Lie group element. 
The sum in equation (2) is replaced by an integral, 
multiplicative weighting is assumed: fij = fifj, 
and fS ~ /jAi (both taken at the same vector), 
corresponds to the weighted temperature fluctua- 
tions in the continuum limit. After a simple calcu- 
lation based on the general orthogonality of group 
representations 



Dfs{cose)^Yl 



aim P —Peicose), (4) 
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^ http://www.eso.org/kgorski/hcalpix 



where the azm's are the coefhcients of the spherical 
harmonics decomposition of fS. The ensemble av- 
erage of the equation yields (Dfs) = (^ + N)Df, 
where we introduced the weight correlation func- 
tion Df, calculated similarly to Dfs- The un- 
biased correlation function estimator can be ob- 
tained as 

~ = Dfs{co.s0) - DMCOS0) ^ 

Df(cos&) 

where Df„ is the average raw noise correlation 
function calculated from M simulations; obviously 
(^) = ^. According to equation (4) this estimator 
is calculated via two spherical harmonics trans- 
forms in discrete pixel space, with the summation 
performed up to (.max- This is a fast realization of 
the estimator in equation (2) with multiplicative 
weights. 

We cross-compared the above algorithm with a 
naive A''^ correlation code, and the results were 
identical. To obtain the angular power spectrum, 
we then sample the above equation exactly at 
the roots of Legendre polynomials, and perform 
Gauss-Legendre integration, as done by SPPSB. 

The expected scaling is N^/'^logN , correspond- 
ing to an effective A^^-^^ slope for the present im- 
plementation in the range of A^ we tested. The 
speed between calculating the correlation esti- 
mator and the Ci inversion is divided approxi- 
mately evenly; on a SOOMhz Dec alpha our present 



HEALpix compatible implementation takes ~260 
seconds, Planck analysis is feasible in about 40 
minutes. 

The above procedure appears unnatural at first: 
if we are using FFT's, and our final goal is to esti- 
mate the angular power spectrum, why do we leave 
Fourier space at all? The reason is simple: window 
and noise are localized in pixel space, thus we can 
construct an unbiased estimator with a simple nor- 
malization. Such a normalization in pixel space is 
equivalent to an edge effect correction, and heuris- 
tic weights can be constructed intuitively and nat- 
urally. In principle an analogous procedure can 
be designed in (, space, but the non-diagonality of 
the window function results in a complex coupling 
matrix infused with Wigner 3j symbols, as Hivon 
et al. (2001) have shown in a tour de force cal- 
culation. The computation and inversion of this 
matrix to unbias the estimator is highly non-trivial 
numerically, especially in the presence of noise and 
complex geometry; so far they have demonstrated 
numerical feasibility for a relatively simple ellip- 
soidal window without any noise weighting. It is 
not inconceivable that this procedure can be suc- 
cessfully extended to more general windows and 
noise weighting, but at a price which can be re- 
garded as unnecessary complexity; therefore we 
recommend the simple technique of constructing 
an unbiased weighted pixel space estimator in- 
stead. 

4. Application: MAP Simulations 

We have generated 1200 MAP simulations us- 
ing HEALPix with inhomogenous sinusoidal cov- 
erage assuming a detector sensitivity of 20 fiK for a 
0.3° X 0.3° pixel, taken from the MAP homepage, 
(http://map.gsfc.nasa.gov/). We assumed a 
Gaussian beam of 18 arcminutes, we neglected 
any sidelobes or asymmetries. Our simulations are 
similar, although not perfectly equivalent to that 
of OSH. In addition to the noisy MAP simulations, 
we have generated 1200 pure noise simulations to 
be used in our estimator of Equation (2). Our 
noise weighting was inversely proportional to the 
expected noise in a pixel, /y ~ l/{aP.aP_), with 
p = 1 motivated by prewhitening, and p = 2 hy 
approximate minimum variance estimator. Both 
performed quite similarly, we show the results 
from p = 2. We used a galactic cut of ±20 de- 



grees. 

The noise correlations were calculated with the 
code and were subtracted from the correlation 
function according to equation (2). While the 
noise was assumed to be diagonal, our practical 
implementation does not make use of this prop- 
erty, except in the heuristic noise weighting we 
adopted. If this assumption would break, our 
method would not be affected, as long as the sim- 
ulations can be generated quickly. 




Fig. 3. — Ci measurements in MAP simulations 
are illustrated. The blue line is the underlying 
theory, while the overlapping black solid line cor- 
responds the the average of 1200 measurements; 
this explicitly demonstrates the unbiased nature 
of our technique. The pair of black lines opening 
up for high ^ displays approximate theoretical er- 
ror bars, according to equation (6). The magenta 
lines are error bars calculated from 1200 simula- 
tions. They track the theoretical error bars quite 
well. 

The Ci integration was performed via a Gauss- 
Legendre quadrature. Beam and pixel window 
functions were accounted for as usual. The results 
are displayed in Figure 3. Along with the average 
C^, the theoretical input, as well as theoretical and 
measured errors are displayed, the former from the 
following approximation (Hivon et al. 2001) 

AC, = yi (C, -t- B{efW{lfN{l)) , (6) 

where Vi = {2i + l)fskyW2/^i with Wm = 
j^ J (Klf"^(n), the m-th moments of the weight 
function fi, B {£) , W {£) , N {£) represent beam, 



pixel, and noise effects, respectively, and fsky is 
the sky fraction in the map. According to Fig- 
ure 3 our method is unbiased, and accurate for 
MAP: the error bars track closely the theoretical 
expectation within the accuracy of the approxi- 
mate theory. The suboptimality of our technique 
is expected to be small for MAP-like surveys, but 
this needs to be confirmed by detailed comparison 
to optimal techniques. Our figure is similar to 
Figure 4 of Oh et al. (1999), but there are small 
differences: it appears that we have assumed a 
slightly higher noise and/or wider beam. It would 
be worthwhile to perform direct comparison of the 
two methods to assess the degree of suboptimal- 
ity of ours, and the sensitivity of the experiment 
specific technique to a potential breakdown of the 
underlying assumptions. 

5. Summary and Discussions 

We have demonstrated that i) correlation func- 
tions with generalized edge correction can be ex- 
tracted with full accuracy for a megapixel map in 
minutes on a modest personal workstation ii) the 
resulting Ci computed from Monte-Carlo realiza- 
tions of MAP observations with inhomogeneous 
coverage agree quite well with the (approximate) 
theoretical error bars inferred from the moments of 
the window function applied to the data to con- 
struct our estimator, as well as the cosmic vari- 
ance. The results should be very close to optimal 
at high £ where the noise variance become domi- 
nant provided that the noise matrix is diagonally 
dominant, but this needs to be confirmed by a de- 
tailed comparison with optimal methods. The the- 
oretical scaling of our method is iV'^/^logA^, with 
a measured scaling of ^ TVi-^i in our present im- 
plementation, for the range of N we tested. 

In the previous tests we have used MAP-like di- 
agonal noise/coverage, no other (e.g., azimuthal) 
symmetries were assumed about noise, or geom- 
etry of the map. Cut out holes around bright 
sources, or any irregularity in the sampling, rep- 
resent only minor perturbations to our method, 
therefore they should have no discernible effect on 
its speed or performance; the investigation of this 
point in sufficient detail is left for future work. 
Our approach is straightforward to generalize for 
non-diagonal noise matrix, specially in the case 
where the off-diagonal elements of the noise ma- 



trix correspond primarily to pairs of pixels of fixed 
angular separation, as is expected for MAP due to 
the differential nature of the measurement. In the 
general case, our main assumption is the fast gen- 
eration of noise realizations in pixel space, which 
will be supplied by fast mapmaking methods (e.g. 
Wright 1996; Prunct et al. 2001). 

At the heart of our method is the use of heuris- 
tic weighting corresponding to general edge ef- 
fect correction. (Indeed, noise can be considered 
as a fuzzy window, or window can be considered 
as infinite noise). The simplicity of our weight- 
ing amounts to a substantial gain in speed over 
the costly optimal weighting {S + N)'"^ . In ad- 
dition, the use of spherical harmonics decompo- 
sition facilitates heuristic signal weighting, which 
is natural in £ domain. The sum of the a/m's in 
equation (4) can be recognized as a pseudo Ci of 
the weighted fluctuations estimated with uniform 
weights in £ space. For low £'s, for complicated 
noise and window patterns, it might be desirable 
to sum the ajm's with weights obtained from the 
correlation matrix < ai-^rniCii2m2 > evaluated from 
Monte Carlo simulations. This matrix can be di- 
agonalized e.g. up to a certain £, or within a band 
of A£, to calculate approximate (semi-heuristic) 
weights for our estimator of the correlation func- 
tion. While such improvements do not seem neces- 
sary for MAP, a clear upgrade path for our method 
exists for more complex experiments if needed. Al- 
though more natural in £ space, heuristic signal 
weighting is possible in pixel space as well (e.g., 
Colombi, Szapudi, & Szalay 1998). 

Our technique has further potential besides 
speeding up C'l estimation: it opens up a full ar- 
ray of possible applications and generalizations. 
These include cross-correlations (between chan- 
nels for component separation, between LSS and 
CMB, B-typc polarization and lensing, etc) non- 
Gaussianity (e.g. 3-point function/bispectrum, 
cumulant correlators for SZ, lensing, etc), vector 
and tensor correlations (for polarization). The 
idea of real space window and noise weighting is 
applicable to power spectrum estimators of galax- 
ies, clusters, lensing etc. as well. A Euclidian ver- 
sion of our algorithm is entirely analogous to the 
spherical case. It will be useful for fast edge and 
noise corrected estimation of the power spectrum, 
bispectrum, A^-point correlation function and cu- 
mulant correlators in galaxy catalogs. These gen- 



eralizations presently under implementation will 
be discussed in subsequent papers and included in 
a later version of SpICE. 
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